
10 research outputs found

Learning From Drift: Federated Learning on Non-IID Data via Drift Regularization

Author: Kim Yeachan
Shin Bonggun
Publication date: 13/09/2023

Federated learning algorithms perform reasonably well on independent and identically distributed (IID) data. However, they suffer greatly in heterogeneous environments, i.e., on Non-IID data. Although many research projects have addressed this issue, recent findings indicate that existing methods remain sub-optimal compared with training on IID data. In this work, we carefully analyze the existing methods in heterogeneous environments. Interestingly, we find that regularizing the classifier's outputs is quite effective in preventing performance degradation on Non-IID data. Motivated by this, we propose Learning from Drift (LfD), a novel method for effectively training the model in heterogeneous settings. Our scheme comprises two key components: drift estimation and drift regularization. Specifically, LfD first estimates how different the local model is from the global model (i.e., the drift). The local model is then regularized so that it does not move in the direction of the estimated drift. In the experiments, we evaluate each method through the lens of five aspects of federated learning: Generalization, Heterogeneity, Scalability, Forgetting, and Efficiency. Comprehensive evaluation results clearly support the superiority of LfD in federated learning with Non-IID data.
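
The two components named in the abstract, drift estimation and drift regularization, can be illustrated with a minimal sketch. The abstract does not give the actual LfD loss, so everything here is an assumption made for illustration: the drift is taken as the difference between local and global softmax outputs, and the penalty is a hinge on how far the current outputs have moved along that direction. The function names (`estimate_drift`, `drift_penalty`) and the `weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_drift(local_logits, global_logits):
    """Assumed drift estimate: local minus global class probabilities."""
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    return [pl - pg for pl, pg in zip(p_local, p_global)]


def drift_penalty(current_logits, global_logits, drift, weight=1.0):
    """Assumed regularizer: penalize movement along the estimated drift.

    The penalty is zero when the current outputs have not moved away from
    the global outputs in the drift direction, and grows with the
    alignment between that movement and the drift otherwise.
    """
    p_current = softmax(current_logits)
    p_global = softmax(global_logits)
    movement = [pc - pg for pc, pg in zip(p_current, p_global)]
    alignment = sum(m * d for m, d in zip(movement, drift))
    return weight * max(0.0, alignment)
```

In a training loop, a term like `drift_penalty` would be added to the local task loss on each client; how LfD actually combines the terms is not stated in the abstract.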

arXiv.org e-Print Archive

Classification of Radiology Reports Using Neural Attention Models

Author: Choi Jinho D.
Chokshi Falgun H.
Lee Timothy
Shin Bonggun
Publication date: 22/08/2017

The electronic health record (EHR) contains a large amount of multi-dimensional and unstructured clinical data of significant operational and research value. Distinguished from previous studies, our approach embraces a double-annotated dataset and strays away from obscure "black-box" models to comprehensive deep learning models. In this paper, we present a novel neural attention mechanism that not only classifies clinically important findings but also shows which terms drive each decision. Specifically, convolutional neural networks (CNN) with attention analysis are used to classify radiology head computed tomography reports based on five categories that radiologists would account for in assessing acute and communicable findings in daily practice. The experiments show that our CNN attention models outperform non-neural models, especially when trained on a larger dataset. Our attention analysis demonstrates the intuition behind the classifier's decision by generating a heatmap that highlights attended terms used by the CNN model; this is valuable when potential downstream medical decisions are to be performed by human experts or the classifier information is to be used in cohort construction such as for epidemiological studies.

arXiv.org e-Print Archive

Crossref

Boosting Convolutional Neural Networks' Protein Binding Site Prediction Capacity Using SE(3)-invariant transformers, Transfer Learning and Homology-based Augmentation

Author: Byun Jeunghyun
Lee Daeseok
Shin Bonggun
Publication date: 18/04/2023

Identifying small-molecule binding sites in target proteins, at either pocket or residue resolution, is critical in many virtual and real drug-discovery scenarios. Since it is not always easy to find such binding sites based on domain knowledge or traditional methods, different deep learning methods that predict binding sites from protein structures have been developed in recent years. Here we present a new deep learning algorithm of this kind that significantly outperformed all state-of-the-art baselines at both resolutions, pocket and residue. This good performance was also demonstrated in a case study involving the protein human serum albumin and its binding sites. Our algorithm included new ideas both in the model architecture and in the training method. For the model architecture, it incorporated SE(3)-invariant geometric self-attention layers that operate on top of residue-level CNN outputs. This residue-level processing allowed transfer learning between the two resolutions, which turned out to significantly improve binding pocket prediction. Moreover, we developed a novel augmentation method based on protein homology, which prevented our model from over-fitting. Overall, we believe that our contribution to the literature is twofold. First, we provided a new computational method for binding site prediction that is relevant to real-world applications, as shown by the good performance on different benchmarks and the case study. Second, the novel ideas in our method (the model architecture, transfer learning, and the homology augmentation) would serve as useful components in future works.
Comment: Updates in version 2: author order change (making it clear that Bonggun Shin is the corresponding author)

arXiv.org e-Print Archive

Phase-shifted Adversarial Training

Author: Kim Seongyeon
Kim Yeachan
Seo Ihyeok
Shin Bonggun
Publication date: 23/08/2023

Adversarial training has been considered an imperative component for safely deploying neural network-based applications to the real world. To achieve stronger robustness, existing methods primarily focus on how to generate strong attacks by increasing the number of update steps, regularizing the models with a smoothed loss function, and injecting randomness into the attack. Instead, we analyze the behavior of adversarial training through the lens of response frequency. We empirically discover that adversarial training causes neural networks to converge slowly on high-frequency information, resulting in highly oscillating predictions near each data point. To learn high-frequency content efficiently and effectively, we first prove that a universal phenomenon, the frequency principle (i.e., "lower frequencies are learned first"), still holds in adversarial training. Based on that, we propose phase-shifted adversarial training (PhaseAT), in which the model learns high-frequency components by shifting these frequencies to the low-frequency range where fast convergence occurs. For evaluation, we conduct experiments on CIFAR-10 and ImageNet with an adaptive attack carefully designed for reliable evaluation. Comprehensive results show that PhaseAT significantly improves the convergence for high-frequency information. This results in improved adversarial robustness by enabling the model to have smoothed predictions near each data point.
Comment: Conference on Uncertainty in Artificial Intelligence, 2023 (UAI 2023)

arXiv.org e-Print Archive

Improving Evidential Deep Learning via Multi-Task Learning

Author: Oh Dongpin
Shin Bonggun
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 17/12/2021

The evidential regression network (ENet) estimates a continuous target and its predictive uncertainty without costly Bayesian model averaging. However, the target may be predicted inaccurately because of the gradient shrinkage problem of the ENet's original loss function, the negative log marginal likelihood (NLL) loss. In this paper, the objective is to improve the prediction accuracy of the ENet while maintaining its efficient uncertainty estimation by resolving the gradient shrinkage problem. A multi-task learning (MTL) framework, referred to as MT-ENet, is proposed to accomplish this aim. In the MTL, we define the Lipschitz modified mean squared error (MSE) loss function as another loss and add it to the existing NLL loss. The Lipschitz modified MSE loss is designed to mitigate the gradient conflict with the NLL loss by dynamically adjusting its Lipschitz constant. By doing so, the Lipschitz MSE loss does not disturb the uncertainty estimation of the NLL loss. The MT-ENet enhances the predictive accuracy of the ENet without losing uncertainty estimation capability on the synthetic dataset and real-world benchmarks, including drug-target affinity (DTA) regression. Furthermore, the MT-ENet shows remarkable calibration and out-of-distribution detection capability on the DTA benchmarks.
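
The idea of adding an MSE term with a bounded gradient to the NLL loss can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the NLL is the standard Normal-Inverse-Gamma evidential regression loss with parameters `gamma`, `nu`, `alpha`, `beta`, and the MSE term is a Huber-style loss whose gradient magnitude is capped at `2 * sqrt(threshold)`. The abstract says the Lipschitz constant is adjusted dynamically; the rule for doing so is not given there, so `threshold` is simply passed in as a hypothetical argument.

```python
import math


def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log marginal likelihood of a Normal-Inverse-Gamma prior.

    gamma is the predicted mean; nu, alpha, beta are the evidential
    parameters (nu > 0, alpha > 1, beta > 0 in the usual setup).
    """
    omega = 2.0 * beta * (1.0 + nu)
    return (
        0.5 * math.log(math.pi / nu)
        - alpha * math.log(omega)
        + (alpha + 0.5) * math.log((y - gamma) ** 2 * nu + omega)
        + math.lgamma(alpha)
        - math.lgamma(alpha + 0.5)
    )


def capped_mse(y, gamma, threshold):
    """Squared error that switches to a linear tail above `threshold`.

    Below the threshold this is the plain squared error. Above it the
    loss grows linearly, so its gradient with respect to gamma never
    exceeds 2 * sqrt(threshold) in magnitude. The two branches agree
    at the switch point, so the loss is continuous.
    """
    squared_error = (y - gamma) ** 2
    if squared_error < threshold:
        return squared_error
    return 2.0 * math.sqrt(threshold) * abs(y - gamma) - threshold


def multitask_loss(y, gamma, nu, alpha, beta, threshold):
    """Assumed combination: NLL plus the gradient-capped MSE term."""
    return nig_nll(y, gamma, nu, alpha, beta) + capped_mse(y, gamma, threshold)
```

Capping the gradient of the MSE term is one way to keep it from dominating the NLL gradient on large errors, which matches the abstract's stated goal of not disturbing the uncertainty estimation.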

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Knowledge-oriented Hierarchical Neural Network for Sentiment Classification

Author: Baccianella Stefano
Bahdanau Dzmitry
Blitzer J.
dos Santos Cicero Nogueira
Frege Gottlob
Ghosh Monalisa
Hamouda Alaa
Kingma Diederik P.
Maas A. L.
Mikolov Tomas
Miller G. A.
Nandi Vikash
Pang Bo
Pengfei Li
Prusa Joseph D.
Ruder Sebastian
Shin Bonggun
Srivastava Nitish
Taboada Maite
Turney Peter D.
Wang Xingyou
Yang Zichao
Yanliu Wang
Zhang Lei
Publication venue: IOP Publishing

Crossref